33 research outputs found

    Features for audio and music classification

    Four audio feature sets are evaluated for their ability to classify five general audio classes and seven popular music genres. The feature sets include low-level signal properties, mel-frequency spectral coefficients, and two new sets based on perceptual models of hearing. The temporal behavior of the features is analyzed and parameterized, and these parameters are included as additional features. Using a standard Gaussian framework for classification, results show that the temporal behavior of features is important for both music and audio classification. In addition, classification is better, on average, if based on features from models of auditory perception rather than on standard features.

    Binary Biometrics: An Analytic Framework to Estimate the Performance Curves Under Gaussian Assumption

    In recent years, the protection of biometric data has gained increased interest from the scientific community. Methods such as the fuzzy commitment scheme, helper-data system, fuzzy extractors, fuzzy vault, and cancelable biometrics have been proposed for protecting biometric data. Most of these methods use cryptographic primitives or error-correcting codes (ECCs) and use a binary representation of the real-valued biometric data. Hence, the difference between two biometric samples is given by the Hamming distance (HD) or bit errors between the binary vectors obtained from the enrollment and verification phases, respectively. If the HD is smaller (larger) than the decision threshold, then the subject is accepted (rejected) as genuine. Because of the use of ECCs, this decision threshold is limited to the maximum error-correcting capacity of the code, consequently limiting the false rejection rate (FRR) and false acceptance rate tradeoff. A method to improve the FRR consists of using multiple biometric samples in either the enrollment or verification phase. The noise is suppressed, hence reducing the number of bit errors and decreasing the HD. In practice, the number of samples is empirically chosen without fully considering its fundamental impact. In this paper, we present a Gaussian analytical framework for estimating the performance of a binary biometric system given the number of samples being used in the enrollment and the verification phase. The error-detection tradeoff curve that combines the false acceptance and false rejection rates is estimated to assess the system performance. The analytic expressions are validated using the Face Recognition Grand Challenge v2 and Fingerprint Verification Competition 2000 biometric databases.
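
The threshold decision and the multi-sample noise suppression described in this abstract can be illustrated with a minimal Python sketch. Bitwise majority voting is only one assumed way to combine samples; the abstract does not say how the samples are fused. The function names, the 8-bit vectors, and the choice to accept when the HD equals the threshold are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def majority_vote(samples: Sequence[Sequence[int]]) -> List[int]:
    """Fuse several bit vectors by per-bit majority (assumed fusion rule)."""
    n = len(samples)
    return [1 if 2 * sum(bits) > n else 0 for bits in zip(*samples)]


def verify(enrolled: Sequence[int], probe: Sequence[int], threshold: int) -> bool:
    """Accept when the HD does not exceed the threshold (tie rule is an assumption)."""
    return hamming_distance(enrolled, probe) <= threshold


# Hypothetical 8-bit template and three noisy probes, each with one flipped bit.
enrolled = [1, 0, 1, 1, 0, 0, 1, 0]
probes = [
    [0, 0, 1, 1, 0, 0, 1, 0],  # bit 0 flipped
    [1, 0, 1, 0, 0, 0, 1, 0],  # bit 3 flipped
    [1, 0, 1, 1, 0, 1, 1, 0],  # bit 5 flipped
]

fused = majority_vote(probes)
single_hd = hamming_distance(enrolled, probes[0])
fused_hd = hamming_distance(enrolled, fused)
```

In this toy case each probe alone differs from the template in one bit, while the fused probe differs in none, so a stricter threshold could be used without rejecting the genuine subject.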

    Pseudo Identities Based on Fingerprint Characteristics

    This paper presents the integrated project TURBINE, which is funded under the EU 7th research framework programme. This research is a multi-disciplinary effort on privacy-enhancing technology, combining innovative developments in cryptography and fingerprint recognition. The objective of this project is to provide a breakthrough in electronic authentication for various applications in the physical world and on the Internet. On the one hand, it will provide secure identity verification thanks to fingerprint recognition. On the other hand, it will reliably protect the biometric data through advanced cryptography technology. In concrete terms, it will provide the assurance that (i) the data used for the authentication, generated from the fingerprint, cannot be used to restore the original fingerprint sample, (ii) the individual will be able to create different "pseudo-identities" for different applications with the same fingerprint, whilst ensuring that these different identities (and hence the related personal data) cannot be linked to each other, and (iii) the individual is enabled to revoke a biometric identifier (pseudo-identity) for a given application in case it should not be used anymore.

    Effect of perceptually irrelevant variance in head-related transfer functions on principal component analysis

    The significant amount of variance in head-related transfer functions (HRTFs) resulting from source location and subject dependencies has led researchers to use principal components analysis (PCA) to approximate HRTFs with a small set of basis functions. PCA minimizes a mean-square error, and consequently may spend modeling effort on perceptually irrelevant properties. To investigate the extent of this effect, PCA performance was studied before and after removal of perceptually irrelevant variance. The results indicate that from the sixth PCA component onward, a substantial amount of perceptually irrelevant variance is being accounted for.
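
As a rough illustration of approximating a data set with a small number of PCA basis functions, the sketch below reconstructs a matrix from its first k principal components and tracks the mean-square error. It assumes NumPy is available and uses random placeholder data in place of measured HRTFs; the matrix shape and the function name `pca_approximate` are hypothetical, and the perceptual preprocessing studied in the abstract is not modeled.

```python
import numpy as np


def pca_approximate(data: np.ndarray, n_components: int) -> np.ndarray:
    """Reconstruct each row of `data` from its first `n_components` principal components."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    weights = centered @ basis.T
    return mean + weights @ basis


# Placeholder data: 50 "responses" with 16 "frequency bins" (not real HRTFs).
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 16))

# Mean-square reconstruction error for k = 0 .. 16 components.
errors = [
    float(np.mean((data - pca_approximate(data, k)) ** 2))
    for k in range(data.shape[1] + 1)
]
```

The error list is non-increasing in k because PCA picks the basis that minimizes mean-square error, which is exactly the criterion the abstract notes is blind to perceptual relevance.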

    ANALYSIS AND SYNTHESIS OF BINAURAL PARAMETERS FOR EFFICIENT 3D AUDIO RENDERING IN MPEG SURROUND

    Reprinted from Breebaart, J. ”Analysis and synthesis of binaural parameter

    Modeling binaural signal detection

    With the advent of multimedia technology and powerful signal processing systems, audio processing and reproduction has gained renewed interest. Examples of products that have been developed are audio coding algorithms to efficiently store and transmit music and speech, or audio reproduction systems that create virtual sound sources. Usually, these systems have to meet the high audio quality of e.g. the compact-disc standard. Engineers have become aware of the fact that signal-to-noise ratios and distortion measures do not tell the whole story when it comes to sound quality. As a consequence, new algorithms have to be evaluated by extensive listening tests. Drawbacks of this method of evaluation are that these tests are expensive and time consuming. Moreover, listening tests usually do not give any insight why a specific algorithm does or does not work. Hence there is a demand for objective and fast evaluation tools for new audio technologies. One way to meet these demands is to develop a model of the auditory system that can predict the perceived distortion and which can indicate the nature of these distortions. This thesis describes and validates a model for the binaural hearing system. In particular, it aims at predicting the audibility of changes in arbitrary binaural stimuli. Two important properties for binaural hearing are interaural intensity differences (IIDs) and interaural time differences (ITDs) present in the waveforms arriving at both ears. These interaural differences enable us to estimate the position of a sound source but also contribute to our ability to detect signals in noisy environments. Hence one of the most important objectives for a comprehensive model is its ability to describe the sensitivity for interaural differences in a large variety of conditions. The basis of the model relies on psychoacoustic experiments that were performed with human listeners. 
In one series of experiments, subjects had to detect the presence of interaural cues for various statistical distributions of the IIDs and ITDs. The results revealed that the energy of the difference of the signals arriving at both ears following a peripheral filtering stage can successfully describe the sensitivity for interaural time and intensity differences. This approach is very similar to Durlach’s EC theory. Furthermore, other listening experiments with varying degrees of stimulus uncertainty revealed that the detection process of the binaural auditory system may well be simulated as a template-matching procedure. The idea of template matching based on the energy of the difference signal was incorporated in a time-domain detection model. This model transforms arbitrary stimuli into an internal representation. This representation comprises four dimensions: time, frequency channel, internal interaural delay and internal interaural level adjustment. The internal model activity as a function of these dimensions entails both binaural and monaural properties of the presented stimuli. The accuracy of these properties is limited by the addition of internal noise and by the limited frequency and time resolution incorporated in various stages of the model. An important model feature is the ’optimal detector’. This optimal detector analyzes the internal representation of the presented waveforms and extracts information from it, for example the presence or absence of a signal added to a masker. This process entails a strategy that optimally reduces the internal noise by integrating information across time and frequency channels. The model was tested for its ability to predict thresholds as a function of spectral and temporal stimulus parameters. During all simulations, all model parameters were kept constant. The results revealed that the model can account for a large variety of experimental data that are described in the literature. 
The most prominent finding was that the model can quantitatively account for the wider effective critical bandwidth observed in band-widening NoSπ experiments. This wider effective bandwidth is found if the threshold of audibility is measured for interaurally out-of-phase signals (Sπ) added to band-limited interaurally in-phase noise (No). In our model, this phenomenon is the result of the fact that the cue for detection is available in a range of filters if the masker bandwidth is sufficiently small. The increased effective bandwidth therefore does not reflect a worse binaural spectral resolution compared to the monaural spectral resolution; rather, it follows from the ability to integrate information across frequency. It was also shown that the optimal detector can account for effects found by manipulating temporal stimulus properties. To be more precise, the model can account for the phenomenon that the temporal resolution of the binaural auditory system obtained from stimuli with time-varying interaural correlations seems to be worse for sinusoidally varying cross correlation than for rectangular correlation modulations. To extend the model’s predictive scope towards more natural listening conditions, experiments were performed with virtual sound sources, which were generated by using head-related transfer functions (HRTFs). The complexity of these impulse responses was gradually decreased by a spectral smoothing operation. During listening tests, subjects had to rate the audibility of this operation. The results revealed that the fine structure of HRTF phase and magnitude spectra is relatively unimportant for the generation of virtual sound sources in the horizontal plane. The same experiment was subsequently simulated with the model.
Comparisons between subject data and model predictions showed that the model could not only predict whether the HRTF smoothing was audible or not, but that it could also predict the amount of perceptual degradation for supra-threshold HRTF smoothing.
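
The idea that the energy of the difference between the two ear signals can serve as a detection cue can be shown with a toy Python sketch. This is not the thesis model: it leaves out the peripheral filtering stage, the internal noise, and the internal delay and level adjustments, and the sampling rate, tone frequency, amplitude, and function name are hypothetical values chosen for illustration.

```python
import math
import random


def difference_energy(left, right):
    """Energy of the sample-wise difference between the two ear signals."""
    return sum((l - r) ** 2 for l, r in zip(left, right))


random.seed(1)
n = 1000
sample_rate = 16000  # assumed sampling rate in Hz
masker = [random.gauss(0.0, 1.0) for _ in range(n)]  # interaurally in-phase noise (No)
signal = [0.1 * math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(n)]
signal_energy = sum(s * s for s in signal)

# No alone: identical noise in both ears, so the difference signal vanishes.
e_no = difference_energy(masker, masker)

# NoSo: the tone is added in phase to both ears and cancels along with the noise.
in_phase = [m + s for m, s in zip(masker, signal)]
e_noso = difference_energy(in_phase, in_phase)

# NoSpi: the tone is added with opposite sign, so it survives the subtraction.
left = [m + s for m, s in zip(masker, signal)]
right = [m - s for m, s in zip(masker, signal)]
e_nospi = difference_energy(left, right)
```

Subtracting the ear signals removes the common masker, so only the out-of-phase condition leaves a nonzero difference energy, here about four times the tone energy because the difference signal is twice the tone.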

    Modeling binaural signal detection

    iv + 198 pp.; 24c